Path Integral Policy Improvement with Covariance Matrix Adaptation

Authors

Freek Stulp

Olivier Sigaud

Abstract

There has been a recent focus in reinforcement learning on addressing continuous state and action problems by optimizing parameterized policies. PI² is a recent example of this approach. It combines a derivation from first principles of stochastic optimal control with tools from statistical estimation theory. In this paper, we consider PI² as a member of the wider family of methods that share the concept of probability-weighted averaging to iteratively update parameters in order to optimize a cost function. We compare PI² to other members of the same family (Cross-Entropy Methods and CMA-ES) at the conceptual level and in terms of performance. The comparison suggests the derivation of a novel algorithm, which we call PI²-CMA, for “Path Integral Policy Improvement with Covariance Matrix Adaptation”. PI²-CMA’s main advantage is that it determines the magnitude of the exploration noise automatically.
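The shared idea of probability-weighted averaging can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a generic, one-shot (non-trajectory) update with a diagonal covariance, and the names `weighted_update`, `quadratic_cost`, `run`, the sample count, and the temperature-like constant `h` are all assumptions made for illustration. Samples are drawn around the current mean, low costs are mapped to high probabilities, and both the mean and the exploration variance are re-estimated as weighted averages.

```python
import math
import random


def weighted_update(mean, var, cost_fn, n_samples=20, h=10.0, rng=random):
    """One probability-weighted averaging step (illustrative sketch only)."""
    dim = len(mean)
    # Sample candidate parameter vectors around the current mean.
    samples = [
        [rng.gauss(mean[d], math.sqrt(var[d])) for d in range(dim)]
        for _ in range(n_samples)
    ]
    costs = [cost_fn(s) for s in samples]
    # Map costs to probabilities: the lowest cost gets the largest weight.
    # Normalizing by the cost range is an assumption of this sketch.
    lo, hi = min(costs), max(costs)
    span = (hi - lo) or 1.0
    weights = [math.exp(-h * (c - lo) / span) for c in costs]
    total = sum(weights)
    probs = [w / total for w in weights]
    # New mean: probability-weighted average of the samples.
    new_mean = [
        sum(p * s[d] for p, s in zip(probs, samples)) for d in range(dim)
    ]
    # New variance: weighted average of squared deviations from the OLD mean,
    # so the exploration magnitude adapts without manual tuning.
    new_var = [
        sum(p * (s[d] - mean[d]) ** 2 for p, s in zip(probs, samples))
        for d in range(dim)
    ]
    return new_mean, new_var, probs


def quadratic_cost(theta, target=(1.0, -2.0)):
    """Hypothetical cost: squared distance to a fixed target."""
    return sum((t - g) ** 2 for t, g in zip(theta, target))


def run(cost_fn, mean, var, n_updates=50, seed=0):
    """Repeat the update with a seeded generator for reproducibility."""
    rng = random.Random(seed)
    for _ in range(n_updates):
        mean, var, _ = weighted_update(mean, var, cost_fn, rng=rng)
    return mean, var


if __name__ == "__main__":
    final_mean, final_var = run(quadratic_cost, [0.0, 0.0], [1.0, 1.0])
    print(final_mean, final_var)
```

Keeping the covariance diagonal and estimating it around the previous mean keeps the sketch short; a full covariance matrix or additional step-size control would be natural extensions, but they are outside what this sketch claims to show.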


Related resources

Adaptation de la matrice de covariance pour l'apprentissage par renforcement direct

Abstract: Solving continuous state and action problems by optimizing parametric policies is a topic of recent interest in reinforcement learning. The PI² algorithm is an example of this approach; it rests on solid mathematical foundations drawn from stochastic optimal control and on tools from statistical estimation theory. In this article, we...


Adaptive exploration through covariance matrix adaptation enables developmental motor learning

FLOWERS Team, INRIA Bordeaux Sud-Ouest, Talence, France. Abstract: The “Policy Improvement with Path Integrals” (PI²) [25] and “Covariance Matrix Adaptation Evolutionary Strategy” [8] are considered to be state-of-the-art in direct reinforcement learning and stochastic optimization respectively. We have recently shown that incorporating covariance matrix adaptation into PI², which yields the PICM...


Covariance Matrix Estimation for Reinforcement Learning

One of the goals in scaling reinforcement learning (RL) pertains to dealing with high-dimensional and continuous state-action spaces. In order to tackle this problem, recent efforts have focused on harnessing well-developed methodologies from statistical learning, estimation theory and empirical inference. A key related challenge is tuning the many parameters and efficiently addressing numerical...


Task Scheduling Algorithm Using Covariance Matrix Adaptation Evolution Strategy (CMA-ES) in Cloud Computing

Cloud computing is a computational model that provides users with resources on demand. The need to schedule users' jobs has emerged as an important challenge in the field of cloud computing. This is mainly due to several reasons, including ever-increasing advancements in information technology and an increase in applications and...


What Does the Evolution Path Learn in CMA-ES?

The covariance matrix adaptation evolution strategy (CMA-ES) evolves a multivariate Gaussian distribution for continuous optimization. The evolution path, which accumulates historical search directions over successive generations, plays a crucial role in the adaptation of the covariance matrix. In this paper, we investigate what the evolution path approximates in the optimization procedure. We show th...



Journal

CoRR

Volume abs/1206.4621

Published 2012
